On the choice of the non-trainable internal weights in random feature maps
Mandal, Pinak, Gottwald, Georg A.
The computationally cheap machine learning architecture of random feature maps can be viewed as a single-layer feedforward network in which the weights of the hidden layer are random but fixed and only the outer weights are learned via linear regression. The internal weights are typically drawn from a prescribed distribution, and their choice significantly impacts the accuracy of random feature maps. We address here the question of how best to select the internal weights. In particular, we consider the forecasting problem, whereby random feature maps are used to learn a one-step propagator map for a dynamical system. We provide a computationally cheap hit-and-run algorithm for selecting internal weights that lead to good forecasting skill. We show that the number of good features is the main factor controlling the forecasting skill of random feature maps and acts as an effective feature dimension. Lastly, we compare random feature maps with single-layer feedforward neural networks in which the internal weights are learned using gradient descent. We find that random feature maps have superior forecasting capabilities whilst having several orders of magnitude lower computational cost.
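The basic architecture described in this abstract — fixed random internal weights, outer weights fit by linear (ridge) regression, used as a one-step propagator — can be sketched as follows. This is a generic illustration, not the paper's method: the Lorenz-63 toy system, the uniform weight ranges, the feature dimension, and the regularization constant are all assumptions, and the hit-and-run selection of internal weights is not implemented here.

```python
import numpy as np

rng = np.random.default_rng(0)

def lorenz63_step(u, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """One Euler step of the Lorenz-63 system, used here only as toy data."""
    x, y, z = u
    return u + dt * np.array([sigma * (y - x), x * (rho - z) - y, x * y - z])

# Generate a training trajectory: columns are states u_0, ..., u_N.
N, d = 5000, 3
U = np.empty((d, N + 1))
U[:, 0] = [1.0, 1.0, 1.0]
for n in range(N):
    U[:, n + 1] = lorenz63_step(U[:, n])

# Fixed random internal weights: features phi(u) = tanh(W u + b).
D = 300                                   # feature dimension (assumed)
W = rng.uniform(-0.1, 0.1, size=(D, d))   # internal weights, never trained
b = rng.uniform(-0.5, 0.5, size=(D, 1))   # internal biases, never trained
Phi = np.tanh(W @ U[:, :-1] + b)          # features of u_0 .. u_{N-1}

# Only the outer weights C are learned, via ridge regression:
# minimize ||U_next - C Phi||^2 + reg ||C||^2.
reg = 1e-6
C = U[:, 1:] @ Phi.T @ np.linalg.inv(Phi @ Phi.T + reg * np.eye(D))

def step(u):
    """Learned one-step propagator: u_{n+1} ~ C tanh(W u_n + b)."""
    return C @ np.tanh(W @ u[:, None] + b).ravel()
```

Forecasts are produced by iterating `step` from an initial condition; only the single linear solve for `C` constitutes training, which is what makes the architecture so cheap.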
Offset Unlearning for Large Language Models
Huang, James Y., Zhou, Wenxuan, Wang, Fei, Morstatter, Fred, Zhang, Sheng, Poon, Hoifung, Chen, Muhao
Despite the strong capabilities of Large Language Models (LLMs) to acquire knowledge from their training corpora, the memorization of sensitive information in the corpora such as copyrighted, harmful, and private content has led to ethical and legal concerns. In response to these challenges, unlearning has emerged as a potential remedy for LLMs affected by problematic training data. However, previous unlearning techniques are either not applicable to black-box LLMs due to required access to model internal weights, or violate data protection principles by retaining sensitive data for inference-time correction. We propose $\delta$-unlearning, an offset unlearning framework for black-box LLMs. Instead of tuning the black-box LLM itself, $\delta$-unlearning learns the logit offset needed for unlearning by contrasting the logits from a pair of smaller models. Experiments demonstrate that $\delta$-unlearning can effectively unlearn target data while maintaining similar or even stronger performance on general out-of-forget-scope tasks. $\delta$-unlearning also effectively incorporates different unlearning algorithms, making our approach a versatile solution to adapting various existing unlearning algorithms to black-box LLMs.
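The core contrastive-offset idea from the abstract can be illustrated with a toy sketch: the logit difference between a small unlearned model and its untouched counterpart is added to the black-box model's logits at inference time. All arrays below are random stand-ins for real model outputs (no actual LMs are loaded), and the specific way token 3 is suppressed is a fabricated example, not the paper's unlearning procedure.

```python
import numpy as np

rng = np.random.default_rng(1)
vocab = 8

# Stand-ins for per-token logits over a tiny vocabulary.
logits_blackbox = rng.normal(size=vocab)       # large black-box LLM (frozen)
logits_small_base = rng.normal(size=vocab)     # small model, original
logits_small_unlearned = logits_small_base.copy()
logits_small_unlearned[3] -= 5.0               # toy stand-in: unlearning
                                               # suppresses token 3

# The offset is the disagreement between the small pair of models ...
offset = logits_small_unlearned - logits_small_base

# ... and is applied to the black-box logits at inference time,
# without ever touching the black-box model's own weights.
adjusted = logits_blackbox + offset

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

p_before = softmax(logits_blackbox)
p_after = softmax(adjusted)   # down-weights the "forgotten" token
```

The point of the construction is that any unlearning algorithm can be run on the small model pair, and the resulting offset transfers the correction to the black-box model.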
Can Machines Be in Language?
In late 2022, large language models (LLMs) erupted into the public spotlight. Pundits were quick to claim that LLMs are the next step on the path to artificial general intelligence (AGI) and even the Singularity. LLMs are artificial neural networks (ANNs) created by a complex process. First, the core ANN is trained on billions of words of text from the Internet to respond to a prompt with a list of the most probable next words following the prompt. Second, the core ANN is "fine-tuned" by a complex process called "tweaking" to make its outputs more satisfactory to humans.
Universality and approximation bounds for echo state networks with random weights
We study the uniform approximation of echo state networks with randomly generated internal weights. These models, in which only the readout weights are optimized during training, have achieved empirical success in learning dynamical systems. Recent results showed that echo state networks with ReLU activation are universal. In this paper, we give an alternative construction and prove that universality holds for general activation functions. Specifically, our main result shows that, under certain conditions on the activation function, there exists a sampling procedure for the internal weights such that the echo state network can approximate any continuous causal time-invariant operator with high probability. In particular, for ReLU activation, we give an explicit construction of these sampling procedures. We also quantify the approximation error of the constructed ReLU echo state networks for sufficiently regular operators.
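The setup this abstract analyzes — a recurrent reservoir with fixed random internal weights and a trained linear readout — can be sketched minimally as follows. The reservoir size, spectral-radius scaling, and delayed-copy task are illustrative assumptions, not the paper's construction or sampling procedure.

```python
import numpy as np

rng = np.random.default_rng(2)

n_res, n_in = 200, 1
W_res = rng.normal(size=(n_res, n_res))
# Rescale so the spectral radius is below 1 (a common heuristic for the
# echo state property; the paper's sampling procedure differs).
W_res *= 0.9 / np.max(np.abs(np.linalg.eigvals(W_res)))
W_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))  # fixed, never trained

def run_reservoir(inputs):
    """Drive the reservoir with a 1-D input sequence; return all states."""
    x = np.zeros(n_res)
    states = []
    for u in inputs:
        x = np.tanh(W_res @ x + W_in @ np.atleast_1d(u))
        states.append(x.copy())
    return np.array(states)

# Toy causal time-invariant target: a delayed copy of the input.
T, delay = 2000, 3
u = rng.uniform(-1, 1, size=T)
y = np.roll(u, delay)
X = run_reservoir(u)[delay:]   # drop the first `delay` steps (wrap-around)
Y = y[delay:]

# Only the readout weights are fit, by ridge regression.
reg = 1e-6
W_out = Y @ X @ np.linalg.inv(X.T @ X + reg * np.eye(n_res))
pred = X @ W_out
```

Since the target here depends causally and time-invariantly on the input history, it falls inside the operator class the universality result covers; only the linear readout `W_out` is optimized.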